The following snippet, written and run as a stand-alone PHP file (it can also be run in the devel module's run-PHP box), imports selected fields from a CSV data file into two related CCK node types. The file is actually tab-delimited (in this case, exported from ACT!), but the delimiter is a parameter of PHP's excellent fgetcsv() function.
I have tested the snippet on various platforms (PHP 4/5, MySQL 4.1/5), always with Drupal 5.0 RC1.
I asked KarenS if it would be useful, and she suggested I post it here in the hope that a discussion emerges about building an import/export module (or enhancing the existing SoC import/export module, which hasn't been updated for Drupal 5 yet).
I am posting it "AS IS", so please excuse the fact that some of the fields, etc., are in Spanish: "empresa" means company, "contacto" means contact, and "insertar_contenido" means insert content. If someone needs this in English, let me know; in the interests of speed, I am offering it as is.
Here goes (please post comments if you have any questions). I have based the code on Moshe Weitzman's style used in the utilities provided with the devel module (formatting thanks to VIM Convert to HTML):
<?php
// Written by Victor Kane - victorkane at awebfactory dot com dot ar
// *** This script affects your database
//
// CONFIGURATION
// Change the value below to TRUE when you want to run the script. After running, immediately
// change back to FALSE in order to prevent accidentally executing this script twice.
$active = FALSE;

// CODE
include_once "includes/bootstrap.inc";
include_once("includes/common.inc");
drupal_bootstrap(DRUPAL_BOOTSTRAP_FULL);

$content_type = 'contacto';
$content_name = 'Contacto';
$archivo_datos = 'backup/test.csv';

if ($active) {
  if (user_access("administer content types")) {
    insertar_contenido($content_type, $archivo_datos);
  } else {
    print "You do not have the 'administer content types' permission";
  }
} else {
  // \$ is escaped so that the variable name is printed rather than its value
  print "exportar-empresas has not been activated. See the \$active variable at the top of the source code";
}

function insertar_contenido ($content_type, $archivo_datos) {
  // $content_name is set at file level, so it has to be imported into the function scope
  global $content_name;
  $handle = fopen($archivo_datos, "r");
  // the first row holds the column headers
  $theHeaders = fgetcsv ($handle, 4096, "\t");
  $lineno = 0;
  while ($line = fgetcsv ($handle, 4096, "\t")) {
    // start a fresh report for each line so earlier lines are not printed again
    $output = '';
    $valueno = 0;
    $lineno++;
    $output .= '<ul>Line: '.$lineno;
    $observaciones = '';
    foreach ($line as $value) {
      if ($value) {
        $output .= '<li>'.$theHeaders[$valueno].': '.$value.'</li>';
        $observaciones .= '<li>'.$theHeaders[$valueno].': '.$value.'</li>';
      }
      $valueno++;
    }
    $output .= '</ul>';
    $node = array();
    $node['title'] = $line[3];
    $node['body'] = $observaciones;
    $node['type'] = $content_type;
    $node['format'] = 3;
    //$node['taxonomy'] = $_REQUEST['taxonomy'];
    $node['name'] = $content_name;
    //$node['date'] = $_REQUEST['date'];
    $node['status'] = 1;
    $node['promote'] = 0;
    $node['sticky'] = 0;
    $log = 'Imported by importar-contactos at ' . date('g:i:s a');
    $node['log'] = $log;
    // node reference: look up the company node created by the earlier import
    $node['field_empresa'] = array (
      0 => array(
        'nid' => db_result(db_query("SELECT nid FROM {node} WHERE title = '%s'", $line[2])),
      ),
    );
    $node['field_cargo'] = array (
      0 => array(
        'value' => $line[22],
      ),
    );
    $node['field_saludo'] = array (
      0 => array(
        'value' => $line[17],
      ),
    );
    $node['field_nombres'] = array (
      0 => array(
        'value' => $line[54],
      ),
    );
    $node['field_apellidos'] = array (
      0 => array(
        'value' => $line[55],
      ),
    );
    $node['field_mvil'] = array (
      0 => array(
        'value' => $line[15],
      ),
    );
    $node['field_telfono_particular'] = array (
      0 => array(
        'value' => $line[14],
      ),
    );
    $node['field_buscapersonas'] = array (
      0 => array(
        'value' => $line[16],
      ),
    );
    $node['field_e_mail'] = array (
      0 => array(
        'value' => $line[89],
      ),
    );
    /*
    // this code, from the autosave module, is unnecessary, since node_save will do it for us :)
    $node['nid'] = db_result(db_query("SELECT id FROM {sequences} WHERE name = '%s'", 'node_nid')) + 1;
    $node['vid'] = db_result(db_query("SELECT id FROM {sequences} WHERE name = '%s'", 'node_revisions_vid')) + 1;
    */
    if ($node['title']) {
      $node = (object)$node;
      $node = node_submit($node);
      node_save($node);
      print $output;
      $nid = $node->nid;
      $tid = 6;
      db_query("INSERT INTO {term_node} (nid, tid) VALUES (%d, %d)", $nid, $tid);
    }
  }
}
?>
Notes:
1. Before this code is run, a very similar script, which is a subset of this one, is run so that the "empresas" (companies) nodes are created first.
2. Then, the line
'nid' => db_result(db_query("SELECT nid FROM {node} WHERE title = '%s'", $line[2])),
can grab the appropriate node id to stash in the node reference field "field_empresa" in the contacto node.
3. The body, called observations (observaciones), is used as a stash for every field that has some data, to test the results.
4. By the way, in the CCK handbook, KarenS affirms in her snippet that CCK doesn't use body, and creates an alternative field; but I didn't experience that in this case (did that change for 5.0 or something?).
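The nested arrays that the snippet assigns to each `field_...` key can be tried out without a Drupal site. The sketch below is only an illustration: the helper names `cck_text_field()` and `cck_noderef_field()`, the `$company_nids` map (a stand-in for the `SELECT nid FROM {node}` lookup from note 2) and the row values are all made up for this example and are not part of CCK.

```php
<?php
// Hypothetical helpers, not part of CCK: they only build the nested arrays
// that the snippet above assigns to $node['field_...'].

// Text-style field: array(0 => array('value' => ...)).
function cck_text_field($value) {
  return array(0 => array('value' => $value));
}

// Node reference field: array(0 => array('nid' => ...)).
function cck_noderef_field($nid) {
  return array(0 => array('nid' => (int) $nid));
}

// Stand-in for the database lookup by title: a title => nid map, as it might
// look after the companies were imported in the first pass (made-up values).
$company_nids = array('Acme' => 12, 'Globex' => 13);

// One parsed row, using the same column positions as the snippet (made-up values).
$line = array(2 => 'Acme', 3 => 'Ana Pérez', 22 => 'Gerente');

$node = array();
$node['title'] = $line[3];
$node['field_cargo'] = cck_text_field($line[22]);
// In this sketch, 0 simply marks "no matching company found".
$node['field_empresa'] = cck_noderef_field(
  isset($company_nids[$line[2]]) ? $company_nids[$line[2]] : 0
);

print_r($node);
```

Keeping the array shapes in two small functions makes it easier to spot a field that needs something other than the `value` key, as the node reference does here.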
OK, if anyone needs any further info or explanations, just yell; hope this helps someone.
Victor Kane
http://awebfactory.com.ar

Comments
Just to clarify, Victor
Just to clarify, Victor described briefly what he had put together and I suggested he post it here for discussion so we can see exactly what he has and how it could fit in. I'm not necessarily advocating that we replace any other module, more that we see where this fits in best. I was curious about how he was importing from csv since that could be a nice tool to add to CCK one way or another. I'm not sure how much this overlaps with the ImportExportAPI either. Victor, have you investigated that module??
And, yes, in 5.x we now have an optional body field for all content types. That was not true in 4.7 which did not use the body field. Sounds like the handbook needs another update :-)
will check into status of export/import module
I did some testing of that module a few months ago and looked at the code. It looks like it could do the same job as my snippet, assuming it is CCK friendly (I don't remember that at all, but it might be more CCK friendly than my snippet, considering the vocabulary hack at the bottom :) ). However, it isn't available for 5.0, so I would have to look into that, and maybe find out the author's road map on it.
Hope the code is useful for someone, and I will report on what I see on the existing import/export module.
Victor Kane
http://awebfactory.com.ar
Victor Kane
http://awebfactory.com
Great work! Also there is
Great work!
Also, there is Node Import, which works with CCK but is not presently working with 5.0.
Node Import Port to 5.x
http://drupal.org/node/97808
Darly
I saw on the issue queue
I saw on the issue queue that Jaza is looking for a new maintainer for the importexportapi module, so that one may be in limbo for now and it is quite complex. So I took another look at the node_import module. The CCK import needed some work, but I posted some things that got it working quite nicely at http://drupal.org/node/105982. As mentioned above though, it is not yet ported to 5.0.
As Drupal 5.0 core and
As Drupal 5.0 core and module development is moving at lightning speed, porting for 5.0 compatibility is becoming a priority. The 4.7.x issues are probably being resolved first, before jumping on the 5.0 train.
Node Import for Drupal 5.x
Seems to be up and running now:
http://drupal.org/project/node_import
Seems like there are a few paths to do this now. Thanks!
OpenConcept | WLP | FVC | OX | OO
--
OpenConcept | Twitter @mgifford | Drupal Security Guide
Let taxonomy_nodeapi() do its thing
Victor,
It might be better to let taxonomy insert into the term_node table. You should be able to remove:
$nid = $node->nid;
$tid = 6;
db_query("INSERT INTO {term_node} (nid, tid) VALUES (%d, %d)", $nid, $tid);
and, instead, right before calling node_save() say this:
$node->taxonomy = array(6);
Also, is it necessary to call node_submit() before node_save()? I haven't messed too much with CCK, but it doesn't seem like you'd have to.
Marc
http://www.funnymonkey.com
Tools for Teachers
great idea, marc
So, making an adjustment in the code (taking into account that $node is an array and not an object until we cast it, just before node_save), the full snippet works perfectly as follows. Notice that we have inserted the assignment to $node['taxonomy'] and commented out the earlier db_query call that followed node_save, which now requires nothing further:
<?php
// Written by Victor Kane - victorkane at awebfactory dot com dot ar
// *** This script affects your database
//
// CONFIGURATION
// Change the value below to TRUE when you want to run the script. After running, immediately
// change back to FALSE in order to prevent accidentally executing this script twice.
$active = TRUE;

// CODE
include_once "includes/bootstrap.inc";
include_once("includes/common.inc");
drupal_bootstrap(DRUPAL_BOOTSTRAP_FULL);

$content_type = 'contacto';
$content_name = 'Contacto';
$archivo_datos = 'backup/test.csv';

if ($active) {
  if (user_access("administer content types")) {
    insertar_contenido($content_type, $archivo_datos);
  } else {
    print "You do not have the 'administer content types' permission";
  }
} else {
  // \$ is escaped so that the variable name is printed rather than its value
  print "exportar-empresas has not been activated. See the \$active variable at the top of the source code";
}

function insertar_contenido ($content_type, $archivo_datos) {
  // $content_name is set at file level, so it has to be imported into the function scope
  global $content_name;
  $handle = fopen($archivo_datos, "r");
  $theHeaders = fgetcsv ($handle, 4096, "\t");
  $lineno = 0;
  while ($line = fgetcsv ($handle, 4096, "\t")) {
    // start a fresh report for each line so earlier lines are not printed again
    $output = '';
    $valueno = 0;
    $lineno++;
    $output .= '<ul>Line: '.$lineno;
    $observaciones = '';
    foreach ($line as $value) {
      if ($value) {
        $output .= '<li>'.$theHeaders[$valueno].': '.$value.'</li>';
        $observaciones .= '<li>'.$theHeaders[$valueno].': '.$value.'</li>';
      }
      $valueno++;
    }
    $output .= '</ul>';
    $node = array();
    $node['title'] = $line[3];
    $node['body'] = $observaciones;
    $node['type'] = $content_type;
    $node['format'] = 3;
    //$node['taxonomy'] = $_REQUEST['taxonomy'];
    $node['name'] = $content_name;
    //$node['date'] = $_REQUEST['date'];
    $node['status'] = 1;
    $node['promote'] = 0;
    $node['sticky'] = 0;
    $log = 'Imported by importar-contactos at ' . date('g:i:s a');
    $node['log'] = $log;
    $node['field_empresa'] = array (
      0 => array(
        'nid' => db_result(db_query("SELECT nid FROM {node} WHERE title = '%s'", $line[2])),
      ),
    );
    $node['field_cargo'] = array (
      0 => array(
        'value' => $line[22],
      ),
    );
    $node['field_saludo'] = array (
      0 => array(
        'value' => $line[17],
      ),
    );
    $node['field_nombres'] = array (
      0 => array(
        'value' => $line[54],
      ),
    );
    $node['field_apellidos'] = array (
      0 => array(
        'value' => $line[55],
      ),
    );
    $node['field_mvil'] = array (
      0 => array(
        'value' => $line[15],
      ),
    );
    $node['field_telfono_particular'] = array (
      0 => array(
        'value' => $line[14],
      ),
    );
    $node['field_buscapersonas'] = array (
      0 => array(
        'value' => $line[16],
      ),
    );
    $node['field_e_mail'] = array (
      0 => array(
        'value' => $line[89],
      ),
    );
    // Marc of funnymonkey.com suggestion: the integer '2' is tid
    $node['taxonomy'] = array(2);
    /*
    // this code, from the autosave module, is unnecessary, since node_save will do it for us :)
    $node['nid'] = db_result(db_query("SELECT id FROM {sequences} WHERE name = '%s'", 'node_nid')) + 1;
    $node['vid'] = db_result(db_query("SELECT id FROM {sequences} WHERE name = '%s'", 'node_revisions_vid')) + 1;
    */
    if ($node['title']) {
      $node = (object)$node;
      $node = node_submit($node);
      node_save($node);
      print $output;
      /*
      $nid = $node->nid;
      $tid = 6;
      db_query("INSERT INTO {term_node} (nid, tid) VALUES (%d, %d)", $nid, $tid);
      */
    }
  }
}
?>
Afterwards, using the devel module, we can see that the value has been placed accordingly when we hit the "Devel load" tab:
Array
(
[4] => stdClass Object
(
[tid] => 4
[vid] => 3
[name] => Nuevo
[description] =>
[weight] => 0
)
)
As for the call to node_submit, I included it because when I saw it included in the autosave module, I checked out the CCK code, and it looked like there was a lot of housekeeping done possibly on certain kinds of CCK objects, so... I may be wrong but I went ahead and included it. It does work without that call, though.
Thanks, once again, Marc!
Victor Kane
http://awebfactory.com.ar
Import CSV Data file
@ Victor: Great contrib.
Could you provide an example CSV data file, or describe its format?
Should the file include headers? What about the column numbers?
Darly
All the sample data is in Spanish
I put it up the way it was because I was concerned that it should be "real running code", and I haven't had time to make a simpler, more straightforward example.
Headers are cool, what I did was:
<?php
// the first row of the file holds the headers
$theHeaders = fgetcsv ($handle, 4096, "\t");
$lineno = 0;
while ($line = fgetcsv ($handle, 4096, "\t")) {
  // ... each data row is processed here ...
}
?>
The CSV in this case happens to be weird, since it is tab delimited without "commas" (courtesy of ACT!, from which I had to import the data into Drupal).
Notice the specification of the non-default tab character in my fgetcsv call (default is comma).
It should be pretty easy to simplify the code to fewer columns in a more classic dbf, foxbase, or excel like export.
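As a rough idea of what such a simplified version could look like, here is a stand-alone sketch that needs no Drupal at all. It assumes a made-up three-column file with a header row; the column names, the sample values and the `read_delimited_rows()` helper are hypothetical and only there to show the `fgetcsv()` call with a non-default delimiter.

```php
<?php
// Hypothetical sample data: a header row plus two tab-separated data rows.
$sample = "company\tfirst_name\temail\n"
        . "Acme\tAna\tana@example.com\n"
        . "Globex\tLuis\tluis@example.com\n";

// Write the sample to a temporary file so that fgetcsv() has something to read.
$path = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents($path, $sample);

// Reads a delimited file and returns one associative array per data row,
// keyed by the names found in the header row.
function read_delimited_rows($path, $delimiter) {
  $rows = array();
  $handle = fopen($path, 'r');
  if ($handle === FALSE) {
    return $rows;
  }
  // Enclosure and escape characters are passed explicitly (PHP's usual defaults).
  $headers = fgetcsv($handle, 4096, $delimiter, '"', '\\');
  while (($line = fgetcsv($handle, 4096, $delimiter, '"', '\\')) !== FALSE) {
    // Skip blank lines and rows whose column count does not match the header.
    if (count($line) != count($headers)) {
      continue;
    }
    $rows[] = array_combine($headers, $line);
  }
  fclose($handle);
  return $rows;
}

$rows = read_delimited_rows($path, "\t");
unlink($path);

foreach ($rows as $row) {
  print $row['first_name'] . ' works at ' . $row['company'] . "\n";
}
```

Addressing cells by header name instead of by position (such as `$line[54]`) means the import keeps working if the export adds or reorders columns, as long as the header names stay the same.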
Here's the reference PHP manual page:
http://www.php.net/fgetcsv
If I have time, I will try to do that if no-one (hopefully) beats me to it :)
Victor Kane
http://awebfactory.com.ar
Writing uid field in newly created node
Everything in the script is working well - except the uid for the newly created nodes is staying as 0 (default) and not the value I have in the CSV. Any ideas?
Thank you.
Mitch
RE: Writing uid field in newly created node
I discovered the answer to this by trial and error. Try adding this:
$node['name'] = $username;
...where $username is the name of the user doing the import, as seen in the {users} table. That did the trick for me; $node['uid'] appeared to have no effect. Note that I am also using a slightly different approach to the import than the script referenced in this thread. My approach is to generate a "faux" form and then submit it with drupal_execute(). The whole thing is packaged in a custom module so I get the benefit of an upload form, validation, etc., for the user who will be doing the uploading.
--Buzz
Modified version of Victor's script
Howdy all - I got stymied by the lack of a 5.0 branch for the node_import module, and the importexportapi module struck me as opaque, so I am working off of Victor's script to do some node importing into Drupal 5.
Here is my significantly modified (and, I hope, improved) version. It is translated to English, with a different method for printing debug information and a different method for translating the CSV columns into node attributes. With this version you must include a header row and name the columns exactly after $node object attributes, like title, body, etc. It is thus more flexible, but not totally flexible, as I have not implemented all possible fields.
It does not automatically recognize your CCK custom field names, you'll still have to modify it for that.
<?php
// Original written by Victor Kane - victorkane at awebfactory dot com dot ar
// Modifications by Jesse Mortenson - aka greenmachine on drupal.org
// *** This script affects your database

// CONFIGURATION
// Change the value below to TRUE when you want to run the script. After running, immediately
// change back to FALSE in order to prevent accidentally executing this script twice.
$active = FALSE;

// This modified script inputs data based upon the first row in your CSV file, the header row. Your headers must match the fields of the node object in order for any data to be saved. For example, you should have columns for title, body, and type (type is the node type).
// Look at the translate_value function to see field types I've already accounted for. You will need to match the key values for special things like organic groups and CCK.
// If you're using CCK, make sure to include YOUR custom machine-readable field names in the list of cases. For most fields, you can just include them in the big list of cases. If the field needs to be submitted with something other than the [0] => [value] => "value" format, you need to add a special case.

// CODE
include_once("includes/bootstrap.inc");
include_once("includes/common.inc");
drupal_bootstrap(DRUPAL_BOOTSTRAP_FULL);

$archive_data = 'files/data_file.csv'; // must point to your data file, which must have proper header row
print "Script working<br />";

if ($active) {
  if (user_access("administer content types")) {
    print "Processing begins \n";
    print '<pre>';
    $nodes_array = insert_content($archive_data);
    print_r($nodes_array); // prints the array of nodes the script gathered from your file, for debugging purposes only
    print '</pre>';
  } else {
    print "You don't have permission to administer this content type";
  }
} else {
  print "Import script is not active. Change the active variable to TRUE to activate";
}

// this function translates a cell in your CSV file and inserts it into the $node array
// the postpone variable is for fields that should be added to the node object AFTER _validate and _submit for whatever reason; I use it here for organic groups
function translate_value($key, $value, $headers, &$node, &$postpone) {
  switch ($headers[$key]) {
    // this first batch is generic node stuff
    case 'title':
    case 'body':
    case 'created':
      $node[$headers[$key]] = $value;
      break;
    case 'uid': // using UID for author, then loads username based on that
      $node['uid'] = $value;
      $user = user_load(array('uid' => $node['uid']));
      $node['name'] = $user->name;
      break;
    case 'type':
      $node['type'] = $value;
      switch ($node['type']) { // optional: I have comment settings conditional on type
        case 'forum':
        case 'blog':
        case 'story':
          $node['comment'] = 2;
          break;
      }
      break;
    // taxonomy should be a comma separated list in the CSV file
    case 'taxonomy':
      $node['taxonomy'] = explode(',', $value);
      break;
    // CCK stuff here, these will need to be changed if your nodes use CCK
    case 'field_company':
    case 'field_title':
    case 'field_article_name':
    case 'field_media_outlet':
    case 'field_author':
    case 'field_notes':
      $node[$headers[$key]] = array(array('value' => $value));
      break;
    case 'field_link':
      $node['field_link'] = array(array('url' => $value, 'title' => '', 'attributes' => 'N;'));
      break;
    // the substr() offsets below expect the date as YYYY-MM-DD
    case 'field_date_published':
      $node['field_date_published'] = array(array('value' => array(
        'mon' => substr($value,5,2),
        'mday' => substr($value,8,2),
        'year' => substr($value,0,4),
        'hour' => 0,
        'minute' => 0,
      )));
      break;
    // Organic groups values. Group ID numbers should be a comma-separated list in the CSV
    case 'og_groups':
      $postpone['og_groups'] = explode(',', $value);
      foreach ($postpone['og_groups'] as $gid) {
        $gnode = node_load($gid);
        $postpone['og_groups_names'][] = $gnode->title;
      }
      break;
    default:
      break;
  }
  // nothing is returned because the $node and $postpone arguments are accepted by reference
}

// this function actually inserts content. See opportunities to comment out the two lines that actually change your database. With those commented out, you just get debug data to look at.
function insert_content ($archive_data) {
  // initialized up front so an empty file still returns a well-formed array
  $nodes_array = array('pre' => array(), 'post' => array());
  $handle = fopen($archive_data, "r");
  $headers = fgetcsv ($handle, 4096);
  //print_r($headers);
  while ($line = fgetcsv ($handle, 4096)) {
    //print_r($line);
    $node = array();
    $postpone = array();
    foreach ($line as $key => $value) {
      translate_value($key, $value, $headers, $node, $postpone);
    }
    // defaults for all imported nodes. feel free to change
    $node['format'] = 3;
    $node['status'] = 1;
    $node['promote'] = 0;
    $node['sticky'] = 0;
    $log = 'Imported node using import script at ' . date('g:i:s a');
    $node['log'] = $log;
    if ($node['title']) {
      $node = (object)$node;
      $nodes_array['pre'][] = $node; // adding to node array for debug info
      node_validate($node);
      $error = form_get_errors();
      if (!$error) {
        $node = node_submit($node); // comment this out to test first before changing the database
        foreach ($postpone as $key => $value) {
          switch ($key) {
            case 'og_groups':
              $node->og_public = 0;
              $node->og_groups = $value;
              break;
            case 'og_groups_names':
              $node->og_public = 0;
              $node->og_groups_names = $value;
              break;
          }
        }
        node_save($node); // comment this out to test first before changing the database
      } else {
        print_r($error);
      }
      $nodes_array['post'][] = $node; // adding to "post" node array for debug info
    }
  } // end while
  /*
  The nodes_array variable includes two arrays: pre and post. This shows you how your nodes look before Drupal takes a look at them (pre), and after the node submission process takes place (post). This helps to debug any problems you're having with field values being stored incorrectly.
  Of course, if you commented out the node_submit and node_save commands, (post) will not be as useful.
  */
  return $nodes_array;
} // end function
?>
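The header-driven mapping can also be tried as a dry run without touching a database. The following is only a sketch of one possible variation, not part of the script above: instead of a `switch` with one `case` per field name, it uses a small lookup table, so adding a CCK field means adding one line. The `$field_kinds` table, the `row_to_node()` function and the sample row are hypothetical.

```php
<?php
// Hypothetical lookup table: CSV header name => how the cell should be stored.
$field_kinds = array(
  'title'         => 'plain',
  'body'          => 'plain',
  'type'          => 'plain',
  'taxonomy'      => 'list',      // comma-separated term IDs
  'field_company' => 'cck_value', // stored as array(array('value' => ...))
  'field_notes'   => 'cck_value',
);

// Converts one CSV row into a node array. Unknown headers are ignored,
// mirroring the "default: break;" branch of translate_value().
function row_to_node($headers, $line, $field_kinds) {
  $node = array();
  foreach ($line as $key => $value) {
    $name = isset($headers[$key]) ? $headers[$key] : '';
    if (!isset($field_kinds[$name])) {
      continue;
    }
    switch ($field_kinds[$name]) {
      case 'plain':
        $node[$name] = $value;
        break;
      case 'list':
        // Split on commas and trim stray spaces around each ID.
        $node[$name] = array_map('trim', explode(',', $value));
        break;
      case 'cck_value':
        $node[$name] = array(array('value' => $value));
        break;
    }
  }
  return $node;
}

// Made-up header row and data row, as fgetcsv() would return them.
$headers = array('title', 'type', 'taxonomy', 'field_company', 'ignored');
$line    = array('Press hit', 'story', '3, 7', 'Acme', 'x');

$node = row_to_node($headers, $line, $field_kinds);
print_r($node);
```

Fields that need extra lookups, such as `uid` or `og_groups` in the script above, would still need their own handling; the table only covers cells that can be converted without calling Drupal.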
Updating data in CCK fields
I've worked out almost the same way to import data into my custom CCK node type. My input data is in XML. I need to turn this XML dataset into Drupal CCK nodes, and to update the imported nodes only when necessary, to decrease processing time.
The importing part of my scenario works well, but I have problems with updating the previously imported nodes.
The weird problem is:
1. On update, I get doubled rows in my CCK-related tables with VID = 0, but only for the first updated CCK node!
2. In the {node} table, the VID is also set to zero after the update...
I should probably provide different data to node_submit(), besides the node->nid that tells Drupal to update the node instead of adding a new one.
Revisions are disabled on my Drupal 5.1 site.
Has anybody seen this before?
Thanks: Marton
Updating data in CCK fields
Okay,
as usual I answer my own question first :)
The problem was that no VID was passed to node_submit() and node_save()!
Sorry: Marton
Import and Update CCK node type - How to construct node array
In return for your forgiveness, here is the path I followed while trying to save foreign data to a CCK node type:
To see what Drupal expects, temporarily store $form_values and $node in variables inside node.module's node_form_submit($form_id, $form_values) function, right after the call to node_submit($form_values). Then retrieve these variables in your page.tpl.php and print them with print_r().
Once the structure of the $form_values array and the $node object is clear, you can write an import function that constructs the $form_values array (as seen in Victor's guide above).
For inserting: pass NO nid and vid to node_submit(), so the core node save function recognizes this as a new node.
For updating: load the necessary fields from the {node} table and the additional CCK tables to check whether the node needs to be updated, along with node.nid and node.vid, then assign these to $form_values before you call node_submit(). This tells the node save function to update the node instead of inserting a new one.
After constructing the $form_values array, you can call node_form_submit($form_id, $form_values) directly, passing the $form_values array to it. Don't worry about $form_id; Drupal doesn't seem to use it in this core function (why is it in there, then?!).
Note: do not leave any additional code in core modules!
The "hack" part of this is that you provide the input data by constructing the form values as Drupal would when you manually enter data for a new CCK node. Drupal's node_form_submit() function does the real work afterwards.
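The insert/update rule can be captured in a small helper that runs on plain arrays, so it can be checked before any Drupal function is called. This is only a sketch under assumptions: `apply_node_keys()` is a hypothetical name, and `$existing` stands for whatever nid/vid pair was read from the {node} table for a previously imported node.

```php
<?php
// Hypothetical helper: prepares a values array for either an insert or an
// update, following the rule described above.
// $existing is NULL for a new node, or array('nid' => ..., 'vid' => ...)
// for a node that was imported before.
function apply_node_keys($values, $existing) {
  if ($existing === NULL) {
    // Insert: make sure no stale identifiers are passed along.
    unset($values['nid'], $values['vid']);
  }
  else {
    // Update: both identifiers have to be present. Leaving out vid is what
    // led to the rows with VID = 0 reported earlier in this thread.
    $values['nid'] = (int) $existing['nid'];
    $values['vid'] = (int) $existing['vid'];
  }
  return $values;
}

// Made-up values for illustration.
$new     = apply_node_keys(array('title' => 'Ana', 'nid' => 99), NULL);
$updated = apply_node_keys(array('title' => 'Ana'), array('nid' => 42, 'vid' => 57));

print_r($new);
print_r($updated);
```

The returned array would then be handed to node_submit() or node_form_submit() as described above; that part is left out here because it needs a bootstrapped Drupal site.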
Cheers: Marton
Node import 5.x error
I have installed Node Import 5.x and am trying to import CSV data into a CCK node. I first hit a problem with date imports: I had installed Date 2.x, so I reverted to Date 1.8. Finally my imports started, but a number of fields were not imported. The fields that were not imported were the select list fields.
Once I change the field type to text field, the node import works properly and all data items are loaded.
It seems to be some issue in node import -> supported -> cck.
Does anyone have a solution?
Victor, I owe you thanks for the solution for loading the dealers' data into Drupal. That part was done in 5.x or 4.x.
Apologies if I barged into the wrong thread.
Regards,
fgetcsv is coding, try modeling
Hi,
fgetcsv is OK for a PHP programmer. Another solution is www.dbTube.org
You can model (draw) and reuse your import definition with a PHP application.
greetings